Abstract

A file synchronizer restores consistency after multiple repli- 
cas of a filesystem have been changed independently. We 
present an algebra for reasoning about operations on filesys- 
tems and show that it is sound and complete with respect 
to a simple model. The algebra enables us to specify a 
file-synchronization algorithm that can be combined with 
several different conflict-resolution policies. By contrast, 
previous work builds the conflict-resolution policy into the 
specification, or worse, does not specify the synchronizer's 
behavior precisely. We classify synchronizers by asking 
whether conflicts can be resolved at a single disconnected 
replica and whether all replicas are identical after synchro- 
nization. We also discuss timestamps and argue that there 
is no good way to propagate timestamps when there is se- 
vere clock skew between replicas. 

1. Introduction 

What is a file synchronizer? Suppose there are multiple 
replicas of a filesystem; perhaps you have one on a server, 
one on a computer at home, and one on a laptop. If you 
make different changes at different replicas, the replicas no 
longer contain the same information. A file synchronizer 
makes them consistent again, while preserving changes you 
made. 

Not every set of replicas can be made consistent
automatically. For example, if src/hello.c is created to say
"Hello, world" on one replica and "Hello, Dolly" on an- 
other replica, it is not obvious how to choose one or the 
other. In cases like these, the file synchronizer needs a pol- 
icy for conflict resolution. Reasonable people might differ 
about what constitutes a good policy; some alternatives 
appear in Section 6. 

The behaviors of many synchronizers are not specified
precisely; understanding how they detect and resolve
conflicts can be difficult. Balasubramaniam and Pierce (1998)
represents a major step forward; it specifies formal
requirements for a file synchronizer, and it derives an algorithm
from those requirements. This algorithm is implemented in
the Unison file synchronizer.

Unison's specification is based on reasoning about states 
of the file system before and after synchronization. This 
state-based approach leads to an unnecessarily narrow view 
of conflicts. Balasubramaniam and Pierce (1998) actually 
builds the conflict-resolution policy into the specification, 
making it unclear how to implement an interesting class of 
conflict-resolution policies. 

We have taken a different approach to specification of 
file synchronizers; as advocated by Lippe and van Oost- 
erom (1992), we reason not about states but about the op- 
erations that are performed at each replica. This paper 
makes the following contributions: 

• We present an algebra of filesystem operations, together 
with algebraic laws that are helpful both for reasoning 
about file synchronization and for implementing synchro- 
nizers. 

• We show that the laws are sound and complete with 
respect to a semantic model of file systems. 

• We explain conflict detection and resolution in terms of 
our algebra, and we show that our technique detects es- 
sentially the same conflicts as the state-based technique 
of Balasubramaniam and Pierce (1998). 

• We identify useful alternatives for conflict resolution, in- 
cluding alternatives that enable users to recover from 
conflicts by making changes at a single replica. 

The paper demonstrates the value of formal approaches to 
practical problems. An algebraic approach can simplify the 
specification, implementation, and user interface of a file 
synchronizer. It may also be possible to extend algebraic 
techniques to other synchronization problems, such as mail 
folders or PalmOS databases. 

2. Formalizing the problem 

We consider the synchronization of $n$ replicas of a
filesystem $F$, numbered $F_1, \ldots, F_n$. Initially all replicas are
identical: $F = F_1 = \cdots = F_n$. At each replica, users and
programs perform operations on the filesystem. We write $S_i$
for the sequence of operations performed at replica $i$. The
task of the file synchronizer is to compute, for each replica,
a sequence $S_i^*$ that makes the replicas consistent and
accounts for all the operations performed at each replica. If
there are no conflicts, all replicas reach the same new state
$F_{post} = S_1^*(S_1(F_1)) = \cdots = S_n^*(S_n(F_n))$, where we take
sequences of operations to act as functions on the state of a
filesystem.

If order of operations didn't matter, we could simply
compute $S = S_1 \cup S_2 \cup \cdots \cup S_n$ and let $S_i^* = S \setminus S_i$. Because order
does matter, however, we have to do more work. The
problem comes from pairs of commands that don't commute;
if $C_1; C_2$ has a different effect from $C_2; C_1$, not all orders
are equivalent. The Introduction contains an example of
such a pair of commands; if $C_1$ writes "Hello, world" and
$C_2$ writes "Hello, Dolly", the last writer wins.

If operations were totally ordered, the problem might still 
be fairly simple; we would compute the list of all oper- 
ations in the proper order, then arrange for the state of 
each replica to be as if that list of operations had been 
performed. Operations at an individual replica are totally 
ordered, but unfortunately we can't order operations be- 
tween replicas. Even if we could guarantee consistency of 
timestamps, we wouldn't want to use timestamp ordering, 
because the agents (users and programs) that perform op- 
erations make decisions about what operations to perform 
by consulting only the states of their local replicas. Agents 
can't make decisions based on the results of operations per- 
formed at remote replicas, even if those actions have already 
taken place according to some global clock. 

We frame the problem of file synchronization as first find- 
ing the set S of all operations that have been performed, 
then computing a useful subset of S such that within the 
subset, all global orderings that are consistent with the local 
orderings have the same effect. Using this subset, we can 
compute the sequences of commands $S_i^*$ to be applied at
each replica. In more detail, we can synchronize replicas in 
three steps: 

1. Update detection examines each replica to determine the
sequence of commands $S_i$ that have been executed at the
replica.

2. Reconciliation takes as many commands as possible from
the sequences $S_i$ and computes the sequences $S_i^*$ to be
executed at each replica.

3. Conflict resolution takes the leftover, "conflicting" com- 
mands and does something with them. 

Our approach simplifies reasoning about all three steps, and 
in the third step it offers a significant advance over previ- 
ous work: reasoning about commands makes it possible to 
devise several conflict-resolution strategies. 

3. A precise model of filesystems

We model a hierarchical filesystem in which paths refer to
files and directories. A path is a sequence of names. We use
Greek letters for paths, most commonly $\pi$. Following Unix
conventions, we use the / character to separate names in a
path, and we write / for the empty path. We write $\pi \preceq \gamma$
iff $\pi$ is a prefix of $\gamma$, i.e., if $\gamma = \pi/\alpha$ for some path $\alpha$, which
might be empty. We write $\pi \prec \gamma$ if $\pi$ is a proper prefix of $\gamma$,
that is, $\pi \preceq \gamma$ and $\pi \neq \gamma$. In filesystem terms, $\pi \prec \gamma$ means
that $\pi$ is an ancestor directory of $\gamma$. If $\pi \not\preceq \gamma$ and $\gamma \not\preceq \pi$,
we say that $\pi$ and $\gamma$ are incomparable. It is a fundamental
property of hierarchical file systems that operations taking
place at incomparable paths are independent.

We write $\mathit{parent}(\pi)$ for the path that immediately
precedes $\pi$. That is, if $\pi$ is not empty, there is a name $n$ such
that $\pi = \mathit{parent}(\pi)/n$. The empty path has no parent.

We model a working filesystem $F$ as a partial function
mapping paths to files and directories. We write $F(\pi)$ to
refer to the file or directory at path $\pi$ in filesystem $F$. For
the contents of a filesystem, we write

$F(\pi) = \mathrm{FILE}(m, x)$   when path $\pi$ contains a file with
                                 metadata $m$ and contents $x$;

$F(\pi) = \mathrm{DIR}(m)$       when path $\pi$ contains a directory
                                 with metadata $m$;

$F(\pi) = \bot$                  when filesystem $F$ contains nothing
                                 at path $\pi$; $\bot$ is pronounced "missing."

Metadata may include permissions, ownership, modification
time, etc., but the metadata of a directory explicitly
does not include information about the directory's children;
that information is encoded in $F$. We write $F(\pi) = X$ when
we know $F(\pi) \neq \bot$ but we don't care if we're dealing with
a file or a directory.

Our model also includes the broken filesystem, which we 
write $F = \bot$, pronounced "broken." The broken filesystem
models the result of an erroneous command, e.g., deleting 
a directory with files under it. Broken filesystems don't oc- 
cur in practice, because the operating system prevents users 
from breaking the filesystem. It is nevertheless useful to in- 
clude the broken filesystem in the model, because it enables 
reasoning about errors. E.g., if a sequence of commands 
produces the broken filesystem, a program attempting to 
execute those commands will fail with an error. 

Our model does not include hard or soft links. 

We use a trivial lattice ordering of filesystems in which
the broken filesystem is the bottom element. We write
the lattice ordering $F_1 \sqsubseteq F_2$, pronounced "$F_1$
approximates $F_2$." This relation holds whenever $F_1 = \bot$ or when
$F_1$ and $F_2$ are pointwise equal functions, i.e., $F_1 \neq \bot$ and
$F_2 \neq \bot$ and $\forall \pi.\; F_1(\pi) = F_2(\pi)$.$^1$ The $\sqsubseteq$ relation is a partial
order, so two filesystems approximate each other if and only
if they are equal.

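To explain changes to working filesystems, we write
$F\{\pi \mapsto X\}$ for the function that is like $F$, except it maps
$\pi$ to $X$.

$$F\{\pi \mapsto X\}(\gamma) = \begin{cases} X, & \text{if } \pi = \gamma \\ F(\gamma), & \text{otherwise} \end{cases}$$

We write $\mathit{childless}_F(\pi)$ iff $F(\pi)$ has no descendants, i.e.,
$\forall \gamma : \pi \prec \gamma \Rightarrow F(\gamma) = \bot$.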
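
The definitions of this section can be sketched in a few lines of Python. This is an illustration only, under assumptions that are not in the paper: paths are tuples of names, a working filesystem is a dict from paths to tagged tuples, and a path that is absent from the dict stands for "missing." All function names are hypothetical.

```python
# Sketch of the filesystem model. Assumptions (not from the paper):
# a path is a tuple of names, () is the empty path, and a working
# filesystem is a dict; a path absent from the dict is "missing".
MISSING = None


def is_prefix(p, g):
    """p is a prefix of g (possibly equal to g)."""
    return g[:len(p)] == p


def is_proper_prefix(p, g):
    """p is an ancestor of g."""
    return is_prefix(p, g) and p != g


def incomparable(p, g):
    """Neither path is a prefix of the other."""
    return not is_prefix(p, g) and not is_prefix(g, p)


def parent(p):
    """The path that immediately precedes p; the empty path has none."""
    if not p:
        raise ValueError("the empty path has no parent")
    return p[:-1]


def lookup(fs, p):
    """F(p): the file or directory at p, or MISSING."""
    return fs.get(p, MISSING)


def update(fs, p, x):
    """F{p -> x}: a copy of fs that maps p to x and is otherwise unchanged."""
    new = dict(fs)
    if x is MISSING:
        new.pop(p, None)
    else:
        new[p] = x
    return new


def childless(fs, p):
    """True iff no proper descendant of p is present in fs."""
    return not any(is_proper_prefix(p, g) for g in fs)
```

Because `update` copies the dict, the original filesystem is left untouched, which matches the functional reading of $F\{\pi \mapsto X\}$.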

4. An algebra of commands 

What commands should we use to model operations on a
filesystem? Because users must understand what a
synchronizer is doing, our algebra of commands should be
consistent with users' mental models of the actions they and
their agents perform on the filesystem. Users might
imagine performing operations like these:

1 Readers familiar with denotational semantics should note
that our ordering is not the ordering typically used for functions;
in particular, if one working filesystem approximates another,
they are identical.
$\mathit{create}(\pi, X)$   Create file or directory $X$ at $\pi$.

$\mathit{remove}(\pi)$      Remove the file or directory that was at $\pi$.

$\mathit{rename}(\pi, n)$   Change the "base name" of a file or
                            directory to $n$, while leaving it in the
                            same place in the hierarchy.

$\mathit{move}(\pi, \pi')$  Move $\pi$ to $\pi'$, also moving all
                            descendants.

$\mathit{derive}(\pi)$      Change an existing file or directory, in a
                            way that could be reproduced
                            mechanically. Because the result can be
                            reproduced, the operation need not say
                            what the final state is. An obvious
                            example is compiling a source to produce
                            a binary.

$\mathit{edit}(\pi, X)$     Change an existing file or directory,
                            leaving it in state $X$, in a way that can't
                            be reproduced mechanically.

The distinction between edit and derive is useful because 
a user may wish to specify a behavior like "don't synchro- 
nize derived files." We distinguish create from edit because 
although both operations have the same postcondition (file 
with new metadata and contents), they have different pre- 
conditions, so the distinction may help detect errors. Ac- 
cordingly, we specify that to create an existing file, or to 
edit a nonexistent file, leaves the filesystem broken. 

These high-level operations may be a good model for 
users, but they are not so good for deriving synchroniza- 
tion algorithms. We simplify. 

• Conceptually at least, move can subsume rename, as it 
does in the Unix system (but not in early versions of 
DOS). 

• Derive can't be distinguished from edit without knowl- 
edge about how files are derived. To avoid synchronizing 
derived files, we would be better off with a more gen- 
eral mechanism for making files "invisible to the syn- 
chronizer." We therefore drop derive. 

• Finally, although it is not clear a priori, the move
operation makes it more difficult to reason about
synchronization. The crux of the problem is that the move
operation affects two different locations in the
filesystem, whereas the other operations affect only one.
Accordingly, we replace $\mathit{move}(\pi, \pi')$ with the sequence
$\mathit{remove}(\pi); \mathit{create}(\pi')$. The Unison synchronizer does the
same. (A move can also be difficult to detect, but that
is not sufficient reason to omit it from the algebra.)

Figure 1 shows how these operations change the contents
of a filesystem at path $\pi$. Using fewer operations simplifies
synchronization but complicates a synchronizer's user
interface. Section 6 explains how to recover a high-level view
for interacting with users.

Precise definitions of the commands

We define the effect of each command as a function from 
filesystems to filesystems. Any command applied to a bro- 
ken filesystem produces a broken filesystem. In the lan- 
guage of denotational semantics, every command is strict 
in the filesystem. Operationally, once a filesystem is bro- 
ken, there is no way to fix it. Figure 2 gives the effects 
of commands on working filesystems. The command break 
is not one we expect to use during synchronization, but it
helps us reason about errors. In particular, by showing that 
a sequence of commands is not equivalent to break, we can 
show those commands can be executed without error on at 
least one filesystem. 

We are interested only in filesystems that satisfy the tree
property: every parent must be a directory. Formally, if
$\pi \prec \gamma$ and $F(\gamma) \neq \bot$ then $F(\pi) = \mathrm{DIR}(m)$ for some $m$.
The commands in Figure 2 maintain the tree property as
an invariant.

The commands have another property that simplifies
reasoning. Each command mentions at most one path $\pi$, and
if a command is applied to a working filesystem, either it
breaks the filesystem or it changes the filesystem only at $\pi$.

Algebraic laws 

Our synchronization algorithm relies on proofs that differ- 
ent sequences of operations can have the same effects. We 
could construct such proofs by using the precise definitions 
of the commands in Figure 2, but it is much easier to reason 
using algebraic laws than to reason directly about mathe- 
matical functions. This section presents the major tech- 
nical contribution of this paper: algebraic laws that form 
part of a sound and complete proof system for reasoning 
about sequences of commands. This proof system appears 
in Table 1; in addition to algebraic laws, which enable us
to rewrite pairs of commands, the proof system includes in- 
ference rules for substitution and transitivity, which enable 
us to extend the rewriting to larger sequences. 

We write commands in a sequence separated by semi- 
colons. These sequences stand for functions from filesys- 
tems to filesystems, as described by this equation: 

$$(C_1; C_2)(F) = C_2(C_1(F)).$$

We write S for a sequence of commands, and we write skip 
for the empty sequence of commands, i.e., the identity func- 
tion on filesystems. 

Although we want to reason about equivalence, the cen- 
tral relation of our algebra is not equivalence but approx- 
imation. To understand why, consider a sequence of two 
commands: one that creates a file, and a second that re- 
moves it. You might think this sequence is equivalent to 
skip: 

$$\mathit{create}(\pi, X); \mathit{remove}(\pi) \stackrel{?}{\equiv} \mathit{skip}.$$

Look again; the initial create operation is not safe on all
file systems. If $\pi$ is already present, or if $\pi$'s parent is not
a directory, $\mathit{create}(\pi, X)$ breaks the filesystem. The correct
relation between these two sequences is this:

$$\mathit{create}(\pi, X); \mathit{remove}(\pi) \sqsubseteq \mathit{skip}.$$
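$$\mathit{create}(\pi, X)\,F = \begin{cases} F\{\pi \mapsto X\}, & \text{iff } F \neq \bot \wedge F(\pi) = \bot \wedge F(\mathit{parent}(\pi)) = \mathrm{DIR}(-) \\ \bot, & \text{otherwise} \end{cases}$$

$$\mathit{edit}(\pi, \mathrm{DIR}(m))\,F = \begin{cases} F\{\pi \mapsto \mathrm{DIR}(m)\}, & \text{iff } F \neq \bot \wedge F(\pi) \neq \bot \\ \bot, & \text{otherwise} \end{cases}$$

$$\mathit{edit}(\pi, \mathrm{FILE}(m, x))\,F = \begin{cases} F\{\pi \mapsto \mathrm{FILE}(m, x)\}, & \text{iff } F \neq \bot \wedge F(\pi) \neq \bot \wedge \mathit{childless}_F(\pi) \\ \bot, & \text{otherwise} \end{cases}$$

$$\mathit{remove}(\pi)\,F = \begin{cases} F\{\pi \mapsto \bot\}, & \text{iff } F \neq \bot \wedge F(\pi) \neq \bot \wedge \mathit{childless}_F(\pi) \\ \bot, & \text{otherwise} \end{cases}$$

$$\mathit{break}\,F = \bot$$

Figure 2: Filesystem operations and their semantics
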
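The semantics in Figure 2 can be transcribed almost directly into executable form. The following Python sketch is an illustration under stated assumptions, not the implementation described later in the paper: the broken filesystem is represented by `None`, a working filesystem by a dict from path tuples to tagged tuples, and the names `seq`, `brk`, and `approximates` are hypothetical. The sketch also assumes that the empty path cannot be created, because it has no parent.

```python
# Executable reading of Figure 2 (a sketch; representation is assumed).
BROKEN = None  # the broken filesystem


def DIR(m):
    return ("DIR", m)


def FILE(m, x):
    return ("FILE", m, x)


def childless(fs, p):
    """True iff no proper descendant of p is present in fs."""
    return not any(len(g) > len(p) and g[:len(p)] == p for g in fs)


def create(p, x):
    def run(fs):
        if fs is BROKEN or p in fs:
            return BROKEN
        # Assumption: the empty path has no parent, so it cannot be created.
        if not p or fs.get(p[:-1], ("",))[0] != "DIR":
            return BROKEN
        return {**fs, p: x}
    return run


def edit(p, x):
    def run(fs):
        if fs is BROKEN or p not in fs:
            return BROKEN
        # Only turning p into a file requires p to be childless.
        if x[0] == "FILE" and not childless(fs, p):
            return BROKEN
        return {**fs, p: x}
    return run


def remove(p):
    def run(fs):
        if fs is BROKEN or p not in fs or not childless(fs, p):
            return BROKEN
        return {q: v for q, v in fs.items() if q != p}
    return run


def brk(fs):
    """The command break: always yields the broken filesystem."""
    return BROKEN


def seq(*cmds):
    """(C1; C2)(F) = C2(C1(F)); seq() with no commands is skip."""
    def run(fs):
        for c in cmds:
            fs = c(fs)
        return fs
    return run


skip = seq()


def approximates(f1, f2):
    """F1 approximates F2: F1 is broken, or the two are equal."""
    return f1 is BROKEN or f1 == f2
```

On a filesystem where the path is already occupied, `seq(create(p, x), remove(p))` returns `BROKEN` while `skip` returns the filesystem unchanged, which is why the two sequences are related by approximation rather than equivalence.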



We pronounce $S_1 \sqsubseteq S_2$ as "$S_1$ approximates $S_2$," or
sometimes "$S_2$ is at least as good as $S_1$." The intended
interpretation is that we can use $S_2$ in place of $S_1$ without
breaking more filesystems and without changing working
outcomes. Frequently of course, two sequences are
completely equivalent; we write $S_1 \equiv S_2$ as an abbreviation for
$S_1 \sqsubseteq S_2 \wedge S_2 \sqsubseteq S_1$. Most of the laws in Table 1 do in fact
use equivalence; laws using approximation are marked with
the $\sqsubseteq$ symbol.

We have organized Table 1 to show that we have con- 
sidered all possible pairs of operations. There are 7 pairs 
involving break. These pairs lead to laws 37-43, which are 
consistent with Figure 2; once a filesystem is broken, no 
operation can fix it, and we know nothing about what hap- 
pened before it broke. 

There are 9 pairs of operations not involving break. Each
such operation mentions exactly one path, and when we
have a pair of paths $\pi_1$ and $\pi_2$, there are four cases to be
considered depending on the values of $\pi_1 \preceq \pi_2$ and $\pi_2 \preceq \pi_1$:

$\pi_1 \preceq \pi_2$    $\pi_2 \preceq \pi_1$    How we write $\pi_1, \pi_2$

T                        T                        $\pi, \pi$
T                        F                        $\pi, \pi/\pi'$
F                        T                        $\pi/\pi', \pi$
F                        F                        $\pi, \varphi$


These combinations account for 36 pairs of operations and
paths, and for the laws numbered 1-36. Laws 3-6 are
further split into D and F forms to account for the difference
in semantics between directories and files. For example,
law 5D says that making $\pi$ a directory commutes with
removing a descendant of $\pi$, but law 5F says that making
$\pi$ a file and then removing a descendant always causes an
error.$^2$ We summarize the proof system as follows:

• Laws 1-2 and 3D-6D say what operations involving a 
directory and its descendant commute. 

• Laws 7-15 say that operations involving incomparable 
paths commute. 

• Laws 16-29 and 3F-5F say that operations which violate 
preconditions break the filesystem. 



2 Either $\pi$ originally had no descendants, in which case trying
to remove one is an error, or it did have descendants, in which
case turning it into a file (as opposed to a directory) is an error.



• Laws 30-34 say when an operation can be combined with 
a previous operation. 

• Pairs 35, 36, and 6F, to which no laws apply, show sig- 
nificant constraints on non-breaking sequences: parents 
must be created before children; children must be re- 
moved before parents; and children must be removed 
before a directory can be made into a file. 

• Laws 37-43 say that any sequence containing break is 
equivalent to break. 

• The non-pair laws say that any sequence is at least as 
good as break and any sequence is at least as good as 
itself. 

• The inference rules say we can apply the laws within 
longer sequences, repeatedly if needed. 

Every pair law except law 3D can be used as a rewrite rule 
from left to right. 

Soundness and completeness 

The proof system in Table 1 is sound and complete. Infor- 
mally, soundness says that any conclusion we draw using 
the proof system is safe, and completeness says that any 
conclusion we draw using the underlying semantics can also 
(nearly) be drawn using the proof system. 
Formally the soundness result is this:

$$S_1 \sqsubseteq S_2 \;\Longrightarrow\; \forall F.\; S_1 F \sqsubseteq S_2 F.$$

The proof is straightforward, if a bit tedious, by induction
on the proofs of judgments of the form $S_1 \sqsubseteq S_2$. We used
automatic techniques to check the soundness of the
algebraic laws.
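
To give a flavor of what an automatic check might look like, the sketch below enumerates every working filesystem over a tiny universe of three paths and tests individual laws against the semantics of Figure 2. It is an assumption-laden illustration, not the technique actually used for the paper: the universe, the tuple encoding of commands, and all function names are hypothetical, and exhaustive checking over a small universe is evidence for the tested instances rather than a proof.

```python
from itertools import product

# Tiny universe (assumed): the root, a path pi, and one descendant pi/pi'.
BROKEN = None
DIR, FILE_X, FILE_Y = ("DIR",), ("FILE", "x"), ("FILE", "y")
PATHS = [(), ("a",), ("a", "b")]


def childless(fs, p):
    return not any(len(g) > len(p) and g[:len(p)] == p for g in fs)


def apply(cmd, fs):
    """One command, following Figure 2. Commands are tuples such as
    ("create", path, value), ("edit", path, value), ("remove", path),
    or ("break",)."""
    if fs is BROKEN or cmd[0] == "break":
        return BROKEN
    op, p = cmd[0], cmd[1]
    if op == "create":
        ok = p not in fs and p != () and fs.get(p[:-1]) == DIR
    elif op == "edit":
        ok = p in fs and (cmd[2] == DIR or childless(fs, p))
    else:  # remove
        ok = p in fs and childless(fs, p)
    if not ok:
        return BROKEN
    new = {q: v for q, v in fs.items() if q != p}
    if op != "remove":
        new[p] = cmd[2]
    return new


def run(seq, fs):
    for cmd in seq:
        fs = apply(cmd, fs)
    return fs


def working_filesystems():
    """All maps over PATHS that satisfy the tree property."""
    for vals in product([None, DIR, FILE_X, FILE_Y], repeat=len(PATHS)):
        fs = {p: v for p, v in zip(PATHS, vals) if v is not None}
        if all(p == () or fs.get(p[:-1]) == DIR for p in fs):
            yield fs


def approximates(s1, s2):
    """s1 approximates s2 on every filesystem in the tiny universe."""
    return all(run(s1, fs) is BROKEN or run(s1, fs) == run(s2, fs)
               for fs in working_filesystems())


def equivalent(s1, s2):
    return approximates(s1, s2) and approximates(s2, s1)
```

For instance, with `pi = ("a",)`, the call `approximates([("create", pi, FILE_X), ("remove", pi)], [])` checks law 33 on this universe, and the empty list plays the role of skip.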

Because of the possibility of commands that break the
filesystem, our completeness result is not exactly what you
might expect. We write $S_1 \parallel S_2$ (pronounced "$S_1$ and $S_2$
have a common upper bound") iff $\exists S : S_1 \sqsubseteq S \wedge S_2 \sqsubseteq S$. In
other words, $S_1 \parallel S_2$ iff there is some sequence that is at
least as good as both of them. In situations where neither
$S_1$ nor $S_2$ breaks the file system, $S_1$, $S_2$, and the upper
bound all have the same effect. Our completeness result
shows that if the effect of $S_1$ approximates the effect of
$S_2$ on every possible filesystem, the two sequences have a
common upper bound:

$$(\forall F.\; S_1 F \sqsubseteq S_2 F) \;\Longrightarrow\; S_1 \parallel S_2.$$
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Commuting or approximating pairs 

1. edit(n, X); edit (tt/tt' , Y) = edit(n/n', Y); edit(n, X) 

2. edit(n/n', Y); edit(n, X) = edit(n, X); edit(ir/Ti', Y) 
3Dg. edit(n, Dm(m)j; create(n/n' , Y) □ 

create(ir / tt' , Y); edit(ir, Dm(m)) 
4Dq. create (71-/71-', Y); edit(ir, Dm(m)) C 

edit(n, Dm(m)); create(iz / V , Y) 
5D. edit(ir, Dia(m)); remove(ir /tt') = 

remove(ir / tt' ); edit(TT,Dm(m)) 
6D. remove(TT / tt' ); edit(ir, DlR.(m)) = 

edit(ir, Dm(m)); remove(-iT / tt') 

7. edit(rT, X); edit(ip,Y) = edit(tp,Y); edit(n, X) 

8. edMjr, X); create(ip,Y) = create (tp,Y); edit(ir, X) 

9. edit(n, X); remove(ip) = remove(ip); edit(n, X) 

10. create ((p,Y); edit(ir,X) = edit(ir,X); create(tp,Y) 

11. create(iz, X); create(ip, Y) = create(ip, Y); create(n, X) 

12. create(rr, X ); remove(ip) = remove(tp); create(ir, X) 

13. remove((p); edit(n,X) = edit(ir, X); remove(ip) 

14. remo«e(i/)); create(n, X) = create (it, X); remove(ip) 

15. rernove(w); rernove(ip) = remove(tp); remove(n) 

Incorrect pairs 

3F. edit(n, FlLE(m, x)); create(ir /it' , Y) = break 
4F. create (71-/71-', Y); edit(n, FlLE(m, x)) = ireafc 
5F. edit(iz, FlLE(m, x)); remove{ir /it') = break 
edit(n, X); create(w, Y) = break 
edit(ir/ir' , X); create(ir,Y) = break 
edit (tt/tt 1 , X); remove (tt) = break 
create(-K , X)\ edit(-TT / tt' ,Y) = break 
create(-K, X); create(ir,Y) = break 
create(-K / tt' , X); create(n, Y) = break 
create (tt, X); remove^ /tt' ) = break 



16 
17 
18 
19 
20 
21 
22 



23. create (tt/tt', X); remove (tt) = break 

24. remove(ir); edit(ir,X) = break 

25. remove(iT); edit(ir / tt' , X) = break 

26. remove(ir); create(ir /tt' , X) = break 

27. remove(n /tt'); create(TT, X) = break 

28. remove(ir); remove(n) = break 

29. remove(Tr); remove (tt/tt 1 ) = break 

Simplifying laws 

30c. edit(TT,X);edit(TT,Y) C edit(-ir,Y) 

31. edit (tt, X); remove (tt) = remove(w) 

32. create(ir, X); edit(ir,Y) = create(iT,Y) 
33c- create(ir, X); remove(ir) C skip 
34^. remove(ir); create(n, X) jZ edit(ir,X) 

Break is idempotent 

37. break; edit(iT,X) = break 

38. break; create(n, X) = break 

39. break; remove(Tr) = break 

40. edit(ir,X); break = break 

41. create(ir, X); break = break 

42. remove(ir); break = break 

43. break; break = break 

Remaining pairs 

6F. remove(n /tt'); edit(iT, FlLE(m, x)) 

35. create(ir, X); create(n /tt' , Y) 

36. remove(ir /tt'); remove(ir) 

Non-pair laws 

Bottom, break jZ S for any S 
Reflexivity. S C S for any S 



Si ^ S2 S2 ^ •S's 
Si C 5 3 



(Transitivity) 



Si C S 2 



S;5i;5' C 5;S 2 ;S' 

N.B. Paths 7r and if> are always incomparable. Where we write 71-/71-', tt' is always nonempty. 



(Substitution) 



Table 1: Proof system for the filesystem algebra 



The implication is this: if there are two sequences of com- 
mands that have the same effect on every filesystem, we can 
find a third sequence that's at least as good as either of the 
first two — and therefore has the same effect on whatever 
filesystems don't break. We sketch the proof here; details 
will be relegated to an accompanying technical report. 

We divide the proof into two cases. Suppose first that
$\forall F.\; S_1 F = \bot$, that is, $S_1$ breaks all filesystems. By
identifying the shortest prefix of $S_1$ that has this property, and
by reasoning about the last operation in that prefix, we
can show $S_1 \equiv \mathit{break}$, and $\mathit{break} \sqsubseteq S_2$ holds for any $S_2$, so
$S_1 \sqsubseteq S_2$ and $S_2$ is the common upper bound.

In the interesting case, $\exists F.\; S_1 F \neq \bot$, and $S_1 F \sqsubseteq S_2 F$
gives $S_1 F = S_2 F \neq \bot$. We define minimal sequences
by considering the sets $p_S = \{S' \mid S \sqsubseteq S'\}$, and we let
$S^{\min}$ be any sequence in $p_S$ of minimal length. The
set $p_S$ is not empty because it contains $S$. We show that
$S_1^{\min} F = S_2^{\min} F \neq \bot$ and that break does not appear in
either sequence. The proof of completeness has three steps.

1. Because there is a filesystem that $S_1^{\min}$ and $S_2^{\min}$ do not
break, no law mentioning break applies. Because they are
of minimal length, no simplifying law applies. We
conclude that in a minimal sequence, no path is mentioned
more than once.

2. The sequences $S_1^{\min}$ and $S_2^{\min}$ must contain exactly the
same set of commands. The key insight is that a
command mentioning path $\pi$ either breaks the filesystem or
changes it only at $\pi$.

3. By applying commutative laws, we can rewrite $S_1^{\min}$ and
$S_2^{\min}$ into a canonical sequence $S$. We use the
following canonical ordering, which first orders commands by
classes and then by pathname within class.

(a) Commands of the form $\mathit{edit}(\pi, \mathrm{DIR}(m))$, in any order
determined by $\pi$.

(b) Commands of the form $\mathit{create}(\pi, X)$, in preorder.

(c) Commands of the form $\mathit{remove}(\pi)$, in postorder.

(d) Commands of the form $\mathit{edit}(\pi, \mathrm{FILE}(m, x))$, in any
order determined by $\pi$.

To rewrite sequences into this form, we may apply
law 4D, so the strongest result we can get is $S_1 \sqsubseteq S \sqsupseteq S_2$,
not equivalence. The canonical sequence $S$ may be
better than $S_1$ and $S_2$, that is, it may be correct on more
filesystems, but whenever $S_1$ or $S_2$ works, $S$ works and
has exactly the same effect.
5. Using the algebra

We have applied our algebra to the three steps of file syn- 
chronization: update detection, reconciliation, and conflict 
resolution. 

Update detection 

Typical filesystems don't keep logs of the operations that
were performed on a filesystem; instead, we have to look at
two states of a filesystem, $F_i$ and $F_i'$, and find a minimal
sequence of operations $S_i$ such that $F_i' = S_i(F_i)$. We can
do so by visiting all the non-$\bot$ paths in each filesystem. As
shown in Figure 1, by comparing $F_i(\pi)$ with $F_i'(\pi)$, we can
decide whether a create, remove, or edit has taken place.
We could conceivably infer an edit operation for each path
that is populated in both filesystems; this strategy
corresponds to the "trivial update detector" mentioned by
Balasubramaniam and Pierce (1998). But this strategy makes
the cost of synchronization proportional to the size of the
filesystem, not the size of what has changed. To do better,
we need to know which paths have identical values in both
filesystems; no edit operations are needed for such paths.

Unfortunately, in typical use $F_i$ represents the state of
the filesystem at the last synchronization, $F_i'$ represents
the current state, and we may wish not to keep a copy
of $F_i$ available indefinitely.$^3$ Even if we keep a copy,
comparing contents of files may be expensive. Accordingly,
file synchronizers typically keep a snapshot of $F_i$, which
is a copy of $F_i$ that includes directory structure and
metadata but omits the contents of files. That is, the snapshot
saves $\mathrm{FILE}(m, \bot)$ instead of $\mathrm{FILE}(m, x)$. An alternative is to
save $\mathrm{FILE}(m, h(x))$, where $h$ is a fingerprinting hash
function (Broder 1993). The assumption is that in practice,
we can avoid examining most contents because no opera- 
tion changes the contents of a file without also changing 
its metadata. The details of exactly what metadata might 
change are subtle; for example, because Unix filesystems 
can rename files without changing their modification times, 
looking at modification time alone can miss updates. Look- 
ing at both modification time and inode number suffices; 
Section 3 of Balasubramaniam and Pierce (1998) has de- 
tails. 

Once we have decided on the create, remove, and edit 
operations that are needed, we can put these operations 
into canonical order. Our completeness theorem tells us 
that the canonical sequence is at least as good as what 
actually happened. 
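
As an illustration of update detection followed by canonical ordering, the Python sketch below compares two complete states and emits commands in the order (a) through (d) given in the completeness proof. It rests on assumptions that are not in the paper: filesystems are dicts from path tuples to tagged tuples, both states satisfy the tree property, and full states are compared directly rather than through a snapshot, so its cost is proportional to the size of the filesystem. The function name `detect_updates` is hypothetical.

```python
def detect_updates(old, new):
    """Return commands, in canonical order, that turn `old` into `new`.

    Sketch only: `old` and `new` are dicts mapping path tuples to
    ("DIR", m) or ("FILE", m, x); an absent path is "missing".
    """
    edits_dir, creates, removes, edits_file = [], [], [], []
    for p in sorted(set(old) | set(new)):
        before, after = old.get(p), new.get(p)
        if before == after:
            continue  # identical values need no command
        if before is None:
            creates.append(("create", p, after))
        elif after is None:
            removes.append(("remove", p))
        elif after[0] == "DIR":
            edits_dir.append(("edit", p, after))
        else:
            edits_file.append(("edit", p, after))
    # Lexicographic order on path tuples lists a parent before its
    # children, which is a preorder traversal for the creates.
    creates.sort(key=lambda c: c[1])
    # The reverse order lists every child before its parent, which is
    # the constraint that postorder is meant to guarantee for removes.
    removes.sort(key=lambda c: c[1], reverse=True)
    return edits_dir + creates + removes + edits_file
```

Placing removals before edits that turn a path into a file matters when a directory is replaced by a file: its former children must be gone before the edit can succeed.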

Reconciliation 

Balasubramaniam and Pierce (1998) characterizes the
requirements on a synchronizer using two slogans: (1) propagate
all non-conflicting operations and (2) if operations
conflict, do nothing. The value of our approach is that it 
enables choices about what to do at a conflict; our second 
slogan is therefore (2) save conflicting operations for later 
resolution. 

We define conflicting operations using the minimal
sequences found by the update detector. Consider two
commands $C_i(\pi) \in S_i$ and $C_j(\gamma) \in S_j$, where $i \neq j$, and
$S_i$ and $S_j$ are minimal sequences such that $F_i = S_i(F)$ and
$F_j = S_j(F)$. We say $C_i(\pi)$ and $C_j(\gamma)$ are conflicting
commands iff $C_j \notin S_i$ and $C_i \notin S_j$ and one of the following
holds:

• $C_i(\pi); C_j(\gamma) \not\equiv C_j(\gamma); C_i(\pi)$, i.e., the commands do not
commute.

• $C_i(\pi); C_j(\gamma) \equiv \mathit{break}$ or $C_j(\gamma); C_i(\pi) \equiv \mathit{break}$, i.e., the
commands break every filesystem.

3 Some operating systems, such as Plan 9, use write-once
optical disks to make it cheap to reconstruct the state of a past
filesystem (Thompson 1995), but such facilities are not common.

When $C_1$ and $C_2$ conflict, we write $C_1 \otimes C_2$.

The reconciler takes the sequences $S_1, \ldots, S_n$ that are
computed to have been performed at each replica. It
computes sequences $S_1^*, \ldots, S_n^*$ that make the filesystems as
close as possible. The idea of the algorithm is that a
command $C \in S_i$ should be propagated to replica $j$ (included
in $S_j^*$) iff three criteria are met:

• $C \notin S_j$, i.e., $C$ has not already been performed at
replica $j$

• no commands at replicas other than $i$ conflict with $C$

• no commands at replicas other than $i$ conflict with
commands that must precede $C$

A command $C'$ must precede command $C$ iff they appear
in the same sequence $S_i$, $C'$ precedes $C$ in $S_i$, and they do
not commute ($C'; C \not\equiv C; C'$).

Here is an example that shows why we consider conflicts
on commands that must precede $C$. Suppose that in the
original filesystem $F(\pi) = \mathrm{FILE}(m_x, x)$ and that we got two
replicas by performing these commands:

$F_1 = (\mathit{edit}(\pi, \mathrm{DIR}(m)); \mathit{create}(\pi/n, \mathrm{FILE}(m_w, w)))\,F$
$F_2 = \mathit{edit}(\pi, \mathrm{FILE}(m_z, z))\,F.$

Commands $\mathit{edit}(\pi, \mathrm{DIR}(m))$ and $\mathit{edit}(\pi, \mathrm{FILE}(m_z, z))$ do
not commute, so they conflict. Therefore we cannot
apply command $\mathit{edit}(\pi, \mathrm{DIR}(m))$ to replica 2. Because
$\mathit{edit}(\pi, \mathrm{DIR}(m))$ must precede $\mathit{create}(\pi/n, \mathrm{FILE}(m_w, w))$, we
cannot propagate the command $\mathit{create}(\pi/n, \mathrm{FILE}(m_w, w))$
either.

Given our three criteria, the reconciliation algorithm
must be equivalent to the following:

for $i \in 1..n$ do
    make $S_i^*$ empty
for $i \in 1..n$ do
    for $j \in 1..n$ do
        for every command $C \in S_i$ do
            if $C$ should be propagated to replica $j$ then
                append $C$ to $S_j^*$

The algorithm is easily modified to compute the sets of 
conflicting commands Sf as well as the sequences S* . 
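
The three criteria can be sketched in Python as follows. This is an illustration under loudly stated assumptions, not the prototype described in Section 6: commands are tuples whose second component is a path tuple, and the `commute` predicate is a conservative stand-in that treats any two commands on comparable paths as non-commuting. The algebra itself is more permissive (laws 1, 2, 5D, and 6D let some such pairs commute), so this sketch may report more conflicts than the laws require. All function names are hypothetical.

```python
def comparable(p, g):
    """One path is a prefix of the other (or they are equal)."""
    n = min(len(p), len(g))
    return p[:n] == g[:n]


def commute(c1, c2):
    # Conservative stand-in (assumption): only commands at
    # incomparable paths are treated as commuting.
    return not comparable(c1[1], c2[1])


def conflicts(c, i, seqs):
    """True if a command at some replica other than i conflicts with c."""
    for k, s_k in enumerate(seqs):
        if k == i or c in s_k:
            continue  # c was also performed at replica k: no conflict there
        for d in s_k:
            if d not in seqs[i] and not commute(c, d):
                return True
    return False


def reconcile(seqs):
    """Compute, for each replica j, the commands to propagate to it."""
    out = [[] for _ in seqs]
    for i, s_i in enumerate(seqs):
        for pos, c in enumerate(s_i):
            # Commands earlier in the same sequence that do not commute with c.
            must_precede = [d for d in s_i[:pos] if not commute(d, c)]
            if conflicts(c, i, seqs) or any(
                    conflicts(d, i, seqs) for d in must_precede):
                continue  # leave c for conflict resolution
            for j, s_j in enumerate(seqs):
                if j != i and c not in s_j:
                    out[j].append(c)
    return out
```

Applied to the example above, extended with one unrelated creation at the second replica, only the unrelated creation is propagated; the two edits of the same path and the dependent creation are all held back.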

6. Implementation 
A prototype 

To verify that our algorithms can be implemented and that 
they work as we expect, we have written a prototype im- 
plementation. The program is about 700 lines of Perl, of 
which 300 lines are blank or comments. The program han- 
dles only two replicas, and it does not modify the filesystem; 
it simply computes the sequences Si and S|- Because it is 
a prototype, the program does not use a snapshot of the 
filesystem; instead we give it a complete copy of the original.
The prototype also takes a simplified view of metadata;
for example, the metadata for a directory is reduced to a 
single bit, which tells whether the program has permission 
to write the directory. 

We have also started integrating our synchronization al- 
gorithm into the Unison synchronizer. 

Enlarging the algebra as seen by users 

We began with a rich collection of filesystem operations, 
then discovered it was easier to develop a useful algebra and 
a correct synchronization algorithm if we kept the number 
of operations small. Because "big" operations can make 
things clearer to the users, however, we recommend that a 
synchronizer introduce subtree and move operations only after
computing the reconciling sequences S_i^* and the conflicting
operations S_i^c.

Collapsing ordered operations 

In a minimal sequence, the only ordering constraints are 
those imposed by laws 3D, 21, and 29, as well as the pairs 
6F, 35, and 36. Informally, parents must be created before 
children, and children must be removed before parents. We 
can eliminate ordering constraints by collapsing create and 
remove operations into operations on their parents. The 
collapsed operations might be called create subtree, remove 
subtree, and edit into subtree. The "collapsed form" of a 
minimal sequence is convenient because it enables us to for- 
get about order, treating the sequence as a set. It should 
be helpful in a user interface, because the collapsed opera- 
tions seen by the user can be performed in any order. Not 
only are the subtree operations easier to understand, but 
if operations must be approved by users, as in the Unison 
synchronizer, the collapsed forms make it impossible for a 
user to approve an inconsistent set of operations (e.g., ap- 
proving the creation of a file without also approving the 
creation of its parent directory). 
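
As a minimal sketch of the collapsing step for create operations, assume that the created paths of a minimal sequence are available as POSIX-style strings (a representation chosen here for illustration only). Each created path is grouped under its topmost created ancestor, and each group can then be shown to the user as a single "create subtree" operation. The function name `collapse_creates` is hypothetical.

```python
import posixpath
from typing import Dict, Iterable, List


def collapse_creates(created: Iterable[str]) -> Dict[str, List[str]]:
    """Group created paths under their topmost created ancestor.

    Returns {subtree root: [created paths in that subtree, root included]}.
    """
    created_set = set(created)
    subtrees: Dict[str, List[str]] = {}
    for path in sorted(created_set):
        root = path
        # Climb while the parent was also created in the same sequence.
        while True:
            parent = posixpath.dirname(root)
            if parent == root or parent not in created_set:
                break
            root = parent
        subtrees.setdefault(root, []).append(path)
    return subtrees


print(collapse_creates(["/a", "/a/b", "/a/b/c", "/x/y"]))
```

Because every created path ends up in the group of an ancestor that was also created, approving a group approves the parent directories along with the children, so the groups can be presented in any order.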

Explicit move 

We recommend that a user interface use move, defined by
move(π, π') = remove(π); create(π', X), where X is the
contents of the original filesystem at π. A move subtree
operation may also be useful. Because the algebraic laws 
governing move are complex, we recommend that move be 
introduced only after reconciliation, to describe either ac- 
tions to be taken or conflicting commands. Using move has 
three benefits. 

1. Performance. If an agent at one replica has moved a file
from π to π', the instructions for performing the same
action at other replicas need mention only the paths
π and π'. If we treat the move operation as a deletion
and creation, the instructions sent to other replicas must
include the full contents of the file.

There are other solutions to this performance prob- 
lem. In particular, if the synchronizer retains a "finger- 
print" that uniquely identifies the contents of each file 
(Broder 1993), then one can build a transport layer that 
avoids sending the contents of any file whose contents 
are already available at another replica. But to realize 
the performance improvement, the synchronizer must be 
careful to send the create operation before the remove
operation, lest contents that were available be discarded
before they are needed. This ordering may conflict with 
orderings used in the user interface, e.g., lexicographic 
ordering by pathname, or ordering by type of operations 
at the convenience of the user. 

2. Retention of metadata. We wish to be able to synchro- 
nize replicas that reside under different operating sys- 
tems, such as Windows, Unix, and MacOS. Because 
each operating system has different metadata, it is in 
general impossible to preserve metadata when sending 
instructions between replicas under different operating 
systems. But there is an important special case, namely,
a user running disconnected at F_1 wishes to restructure a
directory whose contents contain metadata representable
only at F_2. If our algebra includes a move operation, we
can propagate renaming operations from F_1 to F_2 without
losing metadata that makes sense only at F_2. If
we do not have move, but must rely on create, we send
back to F_2 the results of a "best effort" to represent F_2's
metadata on F_1, and we are likely to lose metadata like
Windows access-control lists. A formal characterization 
of "best effort" would be worthwhile, but the problem is 
beyond the scope of this paper. 

3. Usability. The most important reason to keep move is 
to reduce the cognitive burden on users. The Unison 
synchronizer, for example, first decides on a set of transactions,
then asks its users to approve them (footnote 4). If a user
is asked to approve a move operation, the user knows — 
from purely local information — that the contents of the 
renamed file will not be lost. But if the move is split into 
separate create and remove operations, these operations 
may be widely separated in the list of transactions; and 
a user wanting to be sure the remove is safe must inspect 
the entire list. 

A move command also eliminates the possibility of an 
error in which a user approves the remove but not the 
corresponding create, resulting in loss of contents at one 
replica. 

It may surprise you that if a user moves a subtree, we 
introduce many remove / create pairs, let them all partici- 
pate in reconciliation, then combine them into move subtree. 
We considered including move operations in the algebra and 
handling them during reconciliation, but we believe the sim- 
plicity of our current technique outweighs the possible loss 
of efficiency. For today's Unix and Windows filesystems, 
the question is moot; the filesystems don't log move oper- 
ations, and the only way to tell that a subtree has been 
moved is to reconstruct the move from individual remove 
and create operations. 
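
One way to reconstruct moves after reconciliation is to pair remove and create operations whose contents match. The sketch below assumes that the removed paths are given with their contents in the original filesystem and the created paths with their new contents. It uses SHA-256 as the fingerprint, which is an assumption made for illustration and not the fingerprinting scheme cited above. When several removed files share the same contents the pairing is ambiguous, and this sketch simply takes them in sorted order.

```python
import hashlib
from typing import Dict, List, Tuple


def fingerprint(contents: bytes) -> str:
    # Assumption: SHA-256 stands in for a content fingerprint.
    return hashlib.sha256(contents).hexdigest()


def find_moves(
    removed: Dict[str, bytes], created: Dict[str, bytes]
) -> List[Tuple[str, str]]:
    """Pair removed paths with created paths that carry the same contents.

    removed maps path -> contents in the original filesystem;
    created maps path -> contents given by the create operation.
    Returns (old_path, new_path) pairs; unmatched operations are left out.
    """
    by_print: Dict[str, List[str]] = {}
    for path in sorted(removed):
        by_print.setdefault(fingerprint(removed[path]), []).append(path)
    moves: List[Tuple[str, str]] = []
    for new_path in sorted(created):
        candidates = by_print.get(fingerprint(created[new_path]), [])
        if candidates:
            moves.append((candidates.pop(0), new_path))
    return moves


removed = {"/a/x": b"hello", "/a/y": b"bye"}
created = {"/b/x": b"hello", "/c": b"new"}
print(find_moves(removed, created))
```

Operations that find no partner stay as ordinary remove and create operations.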

Footnote 4: Unison's transactions do not resemble the operations
advocated in this paper. Instead, Unison offers three choices:
make F_1 like F_2, make F_2 like F_1, or do nothing. Interestingly,
Unison's update-detection algorithm uses the operations in this
paper (remove, create, edit, and skip), and it suggests a
transaction based on what operation was performed at each replica.
To help the user make a decision, Unison presents these operations
in a simplified form. This form does not distinguish create from
edit, and it collapses subtree operations as described above.

Alternatives for resolving conflicts

After computing the reconciling sequences S_i^*, a synchronizer
should apply those sequences to the replicas (possibly subject to
a user's approval). But what should a synchronizer do with
conflicting commands S_i^c? The freedom to decide this question is
a significant advantage of our approach. We make the following
assumptions, which are consistent with Balasubramaniam and
Pierce (1998):

• If there are no conflicts, the replicas are identical after 
synchronization. 

• Even in the presence of conflicts, the synchronizer preserves
the knowledge of what changes were made by
users. (The sequences S_i^c represent this knowledge.)

• If conflicts occur, a human being must intervene to put 
the filesystem (one or more replicas) into a desirable 
state. We call this intervention repairing the filesystem. 

We have identified three kinds of alternatives for dispos- 
ing of conflicting commands. We characterize them by look- 
ing at what kinds of repair mechanisms they enable. 

• Discard conflicting commands. Under this alternative, 
repairs require simultaneous access to all replicas, since 
the knowledge of conflicting changes made by users is 
preserved only at the replicas at which the changes were 
made. This alternative is forced by the state-based spec- 
ification of Balasubramaniam and Pierce (1998). 

• Propagate information about conflicting commands to all
replicas. If the synchronizer somehow records, at every
replica, all the sequences {S_i^c}, it becomes possible to
perform disconnected repairs. By this we mean that no
matter what the state of any replica, the following scenario
is possible:

1. A synchronization is initiated (by human or other
agency), and the synchronizer runs without human
intervention.

2. The replicas are disconnected.

3. A human being repairs a single replica, leaving the
other replicas unchanged. This repair would use the
information recorded about {S_i^c}. Getting access to
this information might require a special user interface.

4. The replicas are reconnected, a second synchronization
("resynchronization") is initiated, and it runs without
human intervention.

5. The two replicas are identical.

• Transform conflicting commands so they no longer 
conflict, and apply the transformed commands at each 
replica. This alternative is a special case of the previous 
one, in which the synchronizer takes the information 
about conflicting commands and somehow encodes that 
information in the filesystem, e.g., by changing the 
pathnames used in the conflicting commands. Ideally, 
after synchronization, all replicas would be identical. 
Users could then diagnose conflicts and repair the 
filesystem running disconnected, at any replica, using 
only ordinary commands. 

Encoding conflicts in the file system may be confusing, 
but making all replicas identical has compensating ad- 
vantages. 

— A user can determine the states of all replicas by ex- 
amining a single replica. 

— A user need not remember what conflicts occurred at 
the most recent synchronization, because those con- 
flicts manifest themselves as contents of the file sys- 
tem. 



— Once a single replica has reached a desirable state, 
work can proceed at that replica even without resyn- 
chronization. 

We believe that a file synchronizer intended to support mo- 
bile computing should support disconnected repairs. It is 
an open question whether it is better to support such re- 
pairs using a special user interface or to encode information 
about conflicts in the filesystem (leaving all replicas identi- 
cal after synchronization). 

Metadata and modification times

Users have a right to expect that a synchronizer will propa- 
gate a file's metadata as well as its contents. Most metadata 
can be propagated without difficulty, but because clocks 
at different replicas may show different times, propagating 
modification times can cause problems. Here are some re- 
quirements on timestamps: 

1. If the synchronizer thinks two replicas of a file are identi- 
cal, those replicas should bear identical timestamps. This 
requirement ensures that the files are treated as identical 
by other synchronization tools, by Make, by find, etc. 

2. When copying files from one replica to another, synchro- 
nization should not change the relative order of the times- 
tamps. This requirement preserves the correct behavior 
of Make. An early version of Unison used the time of syn- 
chronization as the modification time, sometimes leading 
Make to treat obsolete files as up to date. 

3. Timestamps at a single replica should be such that, if a 
user waits for one time unit to pass, then modifies or 
creates a file, that file will bear a modification time that 
is greater than the modification time of any other file at 
that replica. This requirement is essential for Make to 
function correctly. If it is violated (e.g., because the sys- 
tem clock gets out of whack) the problem can be difficult 
to diagnose. 

4. The outcome of a synchronization should depend only 
on the state of the two file systems being synchronized, 
not on the time at which the synchronization takes place. 
Synchronization itself should not be seen as an operation 
on the filesystem, only as a way of propagating existing 
operations. 

Requirements 2 and 3 are satisfied if this more general con- 
dition holds: If a user performs creation and modification 
operations at both replicas, and if these operations are to- 
tally ordered, then after the synchronizer runs, the times- 
tamps on synchronized files respect this total order. "Totally 
ordered" means not only ordered in real time, but ordered 
up to the ability of the local system to distinguish the ac- 
tions. If a user changes two files 10 milliseconds apart, and 
time stamps have a granularity of one second, these two 
actions are not totally ordered. 

The local clock provides an adequate total ordering for 
events at one replica, no matter what rate it runs at, pro- 
vided it runs forward. The awful truth is that there is 
no way to tell when events at different replicas should be 
totally ordered, even when users take care to order them. 
As noted in Section 2, even if there is a global clock, we 
can't rely on it, because we can't know post hoc whether 
operations ordered in time were so ordered intentionally or 
accidentally. 
If there is no consistent global clock, as is typically the
case, the problems get worse; in the presence of clock skew,
the conditions above cannot all be satisfied simultaneously.
For example, if replica F_1 is running an hour ahead of
replica F_2, then changes to files modified within the last
hour cannot be propagated to F_2 without either giving them
different time stamps or violating the total ordering. We
believe it is better to give them different time stamps
(footnote 5). If the time skew is small, it may be even better
to freeze synchronization for a few seconds, allowing the clock
at F_2 to catch up with the latest modification time at F_1.
A formal study of synchronization in the presence of clock skew
might yield more convincing recommendations.

Many of these problems would be solved if the filesystem 
used vector clocks (Fidge 1988; Mattern 1989) to create 
timestamps for modification times. Unfortunately, such a 
plan would require sweeping changes in both operating sys- 
tems and program-development tools. For example, using 
a vector clock, a derived file could be not only out of date 
or up to date, but "concurrent" with respect to a source 
file. Make would have to be modified to deal with such new
relationships.
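
To make the "concurrent" relationship concrete, here is a minimal sketch of the standard vector-clock comparison. The representation (a dictionary from replica name to counter) and the replica names are assumptions chosen for illustration; no existing filesystem or version of Make exposes such timestamps.

```python
from typing import Dict

VectorClock = Dict[str, int]  # assumption: replica name -> event counter


def compare(a: VectorClock, b: VectorClock) -> str:
    """Classify a against b as 'equal', 'before', 'after', or 'concurrent'."""
    replicas = set(a) | set(b)
    a_le_b = all(a.get(r, 0) <= b.get(r, 0) for r in replicas)
    b_le_a = all(b.get(r, 0) <= a.get(r, 0) for r in replicas)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"


source = {"F1": 3, "F2": 0}   # source file last edited at replica F1
derived = {"F1": 2, "F2": 1}  # derived file rebuilt at replica F2
print(compare(derived, source))
```

In this example the derived file is neither older nor newer than the source, so a tool that knows only "out of date" and "up to date" has no rule that applies.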

7. Related work 
Merging 

File synchronization is closely related to the problem of 
merging unrelated changes to an object. This problem has 
been studied extensively in the context of software config- 
uration management (Conradi and Westfechtel 1998, §5.5), 
in which the objects may be single files, programs, parse 
trees, databases, etc.; in file synchronization, the filesystem 
is the "object" to which changes have been made. 

Our approach is closest to that of Lippe and van Oos- 
terom (1992), which advocates reasoning about sequences 
of operations, not just initial and final states. The setting 
is general and abstract; the CAMERA tools work with ar- 
bitrary state and operations, exploiting only commutative 
laws. The paper describes algorithms for finding and resolv- 
ing conflicts efficiently, even in cases where it is expensive 
to compare two operations and determine if they commute. 
It identifies three kinds of conflict-resolution policies: drop 
conflicting commands, impose an ordering on conflicting 
commands, and edit the merged sequence of commands in 
an arbitrary way. 

Kermarrec et al. (2001) describes IceCube, another gen- 
eral tool. Unlike CAMERA, IceCube does not use com- 
mutative laws to determine permissible orderings of com- 
mands; instead, it uses ordering constraints, which deter- 
mine when one operation may follow another in a merged 
sequence. The ordering constraints that apply to a pair of 
operations may depend on the state of the replica to which 
the operations are applied. There are no conflicting com- 
mands, and there is no conflict resolution as such; instead 
IceCube searches for a global ordering of operations that 
satisfies all constraints. To reduce the size of the search 
space, IceCube uses special "static" constraints, which are
independent of the states of the replicas; the absence of such
a constraint may be considered a sort of tentative commutative
law. The performance of and results produced by
IceCube are very sensitive to the choices of constraints and
the division into static and dynamic constraints.

Footnote 5: Even in this case, a synchronizer might well have to
wait one tick at F_2 for every file synchronized, in order to
respect the total order without creating any files "newer than now."

Among special-purpose tools, the one most relevant to 
file synchronization appears to be the IPSEN merge tool 
(Westfechtel 1991), an operation-based tool in which the 
objects to be merged are abstract-syntax trees and the op- 
erations are tree-editing operations. No laws are given; in- 
stead, Westfechtel presents a merging algorithm. The paper 
includes an informal description of an extension that can de- 
tect and correct conflicts that involve bindings of identifiers. 
It is not clear whether this tool could be adapted to work on 
filesystems, but the question is interesting because the ex- 
tension might provide some hints about resolving conflicts 
in filesystems that include hard and soft links. 

Although file synchronization is an instance of the general 
merging problem, it has two distinguishing characteristics: 

• It is very cheap to compare two operations to see if they 
commute. 

• Synchronizers must work with the current states of the 
replicas. A synchronizer cannot edit a log, then replay 
that log from scratch. The "drop conflicts" or "impose 
an order" strategies (Lippe and van Oosterom 1992) are 
therefore impossible. 

Lippe and van Oosterom (1992) mentions that some op- 
erations may be "redundant," and that eliminating such op- 
erations may speed reconciliation and reduce conflicts. Our 
simplifying laws may be seen as a formal way of removing 
redundant operations. The particular laws we use enable us 
to put sequences of operations into canonical form, which 
greatly simplifies reconciliation. It is unclear to what extent 
these ideas might apply to a more general tool. 

Conflict detection

We had expected our definition of conflicts, which uses con- 
flicting commands, to be equivalent to Unison's definition 
(Balasubramaniam and Pierce 1998). Our definition is ac- 
tually slightly stronger. That is, if our definition says there 
is a conflict, Unison's definition also detects a conflict, but 
there are cases in which Unison's definition detects a con- 
flict that our definition handles without conflict. These 
cases turn out to be uninteresting, however. 

Unison detects conflicts using dirty sets. Using our notation,
an update detector applied to original filesystem F
and replica F_i produces a set dirty_i, which must satisfy two
properties:

• π ∉ dirty_i ⟹ F_i(π) = F(π), i.e., clean files haven't
changed

• π/π' ∈ dirty_i ⟹ π ∈ dirty_i, i.e., if a path is dirty its
parent is dirty

A dirty set is a safe estimate of paths where changes have
been made; a good update detector computes the smallest
possible dirty set. There is a dirty-set conflict at path π
iff π ∈ dirty_i ∩ dirty_j and F_i(π) ≠ F_j(π) and either F_i(π)
or F_j(π) is not a directory. (The specification in Balasubramaniam
and Pierce ignores directory metadata, so all directories
are considered identical. Unison's implementation does not
ignore directory metadata.)
An example shows it is possible to have a dirty-set conflict
without having conflicting commands. Let the original
filesystem and the two replicas be given by these equations:

F = {/ ↦ DIR(m), /d ↦ DIR(m), /d/f ↦ FILE(m_x, x)}
F_1 = (remove(/d/f); remove(/d)) F
F_2 = (remove(/d/f)) F.

The least dirty sets must be

dirty_1 = {/, /d, /d/f}
dirty_2 = {/, /d, /d/f}

N.B. /d ∈ dirty_1 because replica 1 changed at /d, but /d ∈
dirty_2 because /d/f ∈ dirty_2 and parents of dirty paths are
dirty. We have a dirty-set conflict at /d because it is dirty
in both replicas and F_1(/d) is not a directory.

Our algebra finds no conflict. S_1 = remove(/d/f);
remove(/d) and S_2 = remove(/d/f), so there are no
conflicting commands. In practice, we can safely apply
remove(/d) to replica 2, so we believe this example should
be considered non-conflicting.
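
The example can be checked mechanically. The sketch below is not Unison's update detector; it assumes a toy model in which a filesystem is a dictionary from path strings to contents, every directory is the string "DIR" (so directory metadata is ignored), and a missing path is simply absent. The helper names are hypothetical.

```python
import posixpath
from typing import Dict, Set

Filesystem = Dict[str, str]  # assumption: path -> "DIR" or file contents


def least_dirty_set(original: Filesystem, replica: Filesystem) -> Set[str]:
    """Changed paths plus all of their ancestors (parents of dirty paths are dirty)."""
    dirty: Set[str] = set()
    for changed in set(original) | set(replica):
        if original.get(changed) == replica.get(changed):
            continue
        path = changed
        while True:
            dirty.add(path)
            parent = posixpath.dirname(path)
            if parent == path:  # reached the root
                break
            path = parent
    return dirty


def dirty_set_conflicts(f1: Filesystem, f2: Filesystem,
                        dirty1: Set[str], dirty2: Set[str]) -> Set[str]:
    """Paths dirty in both replicas whose contents differ and are not both directories."""
    return {
        p for p in dirty1 & dirty2
        if f1.get(p) != f2.get(p)
        and (f1.get(p) != "DIR" or f2.get(p) != "DIR")
    }


F = {"/": "DIR", "/d": "DIR", "/d/f": "x"}
F1 = {"/": "DIR"}               # remove(/d/f); remove(/d)
F2 = {"/": "DIR", "/d": "DIR"}  # remove(/d/f)
d1, d2 = least_dirty_set(F, F1), least_dirty_set(F, F2)
print(sorted(d1), sorted(d2), sorted(dirty_set_conflicts(F1, F2, d1, d2)))
```

In this toy model both dirty sets come out as {/, /d, /d/f} and the only dirty-set conflict is at /d, matching the discussion above.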

In the other direction, whenever there are conflicting
commands, there is a dirty-set conflict. For consistency
with Balasubramaniam and Pierce, we assume that all directories
have the same metadata and write simply DIR for
directories. We assume we have unbroken filesystems F,
F_1, and F_2; the minimal sequences S_1 and S_2; and the dirty
sets dirty_1 and dirty_2 from the update detectors. Finally, we
assume that the minimal sequences do not contain unnecessary
commands of the form edit(π, DIR). That is, because
all directories have the same metadata, if F(π) = DIR then
the command edit(π, DIR) must not appear in S_1 or S_2.

If two commands conflict, one path must precede the
other, since otherwise the commands would commute.
Without loss of generality, we number the replicas to choose
C_1(π) ∈ S_1 and C_2(π/π') ∈ S_2, where π' may be empty, such
that C_1(π) conflicts with C_2(π/π'). We prove there is a
dirty-set conflict at path π.

Because each sequence S_i is of minimal length, we know
that F_1(π) ≠ F(π) and F_2(π/π') ≠ F(π/π'). Therefore
π ∈ dirty_1 and π/π' ∈ dirty_2. Because dirty sets are closed
under the parent relation, π/π' ∈ dirty_2 means π ∈ dirty_2.
What we have left to show is that F_1(π) ≠ F_2(π), and in
particular either F_1(π) or F_2(π) is not a directory.

Suppose that F_1(π) = F_2(π) = DIR. Because S_1 is
minimal, C_1(π) is the only command in S_1 that mentions
path π, and so F_1(π) = (C_1(π) F)(π) = DIR. We conclude
that C_1(π) must be either create(π, DIR) or edit(π, DIR). In
either case we can be sure that F(π) ≠ DIR because otherwise
edit(π, DIR) could be removed from S_1, contradicting
our assumptions. By assumption, F_2(π) = DIR, so there
must be a command in S_2 that mentions π; call it C_2(π).
By similar reasoning C_2(π) must be either create(π, DIR)
or edit(π, DIR), and since the replicas have the same initial
and final states at π, in fact C_1(π) = C_2(π). But this
forces C_1(π) ∈ S_2, which contradicts the assumption that
C_1(π) conflicts with C_2(π/π'). Therefore F_1(π) and F_2(π)
cannot both be directories.

Similar reasoning shows that F_1(π) ≠ F_2(π), and therefore
we have a dirty-set conflict at π.

Other synchronizers 

Space limitations preclude a thorough discussion of other 
synchronizers here. Commercial file synchronizers include
Microsoft's Briefcase (Schwartz 1996; Microsoft 1998) and
Leader Technologies' PowerMerge. Puma Technologies' In- 
telliSync solves a related problem: synchronizing various 
kinds of database files used in handheld and other com- 
puters (Puma a; Puma b). In addition to the Unison 
synchronizer (Balasubramaniam and Pierce 1998), there 
is an experimental synchronizer developed by the Ru- 
mor project (Reiher et al. 1996). Balasubramaniam and 
Pierce (1998) discusses some of these synchronizers, as well 
as connections to research in distributed file systems and 
databases. There is also the more recent Reconcile syn- 
chronizer (Howard 1999). 

The synchronizers listed above synchronize all replicas at 
once, propagating operations from every replica to every 
other. Cox and Josephson (2001) describes Tra, a synchro- 
nizer that can defer some propagations to later synchroniza- 
tions, or even indefinitely. It works by using a variation on 
vector clocks to identify conflicts and to determine what 
operations should be propagated. 

8. Discussion 

Balasubramaniam and Pierce (1998) specifies a file synchro- 
nizer by presenting preconditions and postconditions for the 
states of two filesystems before and after synchronization. 
Although these conditions completely determine a synchro- 
nization algorithm, we hope to have convinced you that 
other postconditions might be equally desirable, or possi- 
bly even more desirable. By reasoning about an algebra of 
operations instead of states, we have shown that there can 
be a family of specifications for file synchronizers, each of 
which could be considered correct. Different members of the 
family might offer different tradeoffs in their treatments of 
conflicting commands. Our algebraic approach illuminates 
the design space. 

Because there are many different ways to formulate 
filesystem operations, we have taken care to give not only 
algebraic laws, but also an underlying model, and to show 
that the laws form a sound and complete proof system for 
that model. Although this style of specification is more 
elaborate than simply appealing directly to the algebra and 
its laws, it helps deal with a central problem of formal spec- 
ification: ensuring the specification accurately describes the 
intended behavior. An implementor or a user can look at 
Figure 2 and say, "Yes, that is a filesystem and its operations."
It is much more difficult to say whether Table 1
describes a filesystem. 

Our algebra is carefully crafted so we can take any two 
states of a file system and construct a canonical, minimal 
sequence of operations that connects the states. For exam- 
ple, our edit operation uses the final contents of a file, not 
the delta, and our algebra lacks a move operation. It is not 
clear whether an equally useful algebra can be crafted to 
solve other kinds of reconciliation problems. 

We hope our techniques may apply to other algebras. For 
example, mail systems such as MH use filesystems to hold 
electronic mail. Directories represent mail folders, and files 
represent messages. File names represent message num- 
bers. The message numbers themselves are not important. 
More precisely, although message numbers at an individ- 
ual replica should not be changed gratuitously, it might be 
acceptable to have different message numbers at different 
replicas, and it might be acceptable if message numbers 
changed as a result of synchronization.

The mail-folder algebra corresponds not to filesystem operations
but to mail-handling commands: rmm, which removes
a message; refile, which moves a message between
folders; and inc, which accepts delivery of new messages.
Such commands assign message numbers and maintain internal
invariants, e.g., the integrity of .mh_sequences. One
may also see a rare edit operation, e.g., to patch botched 
headers, to reformat unreadable content created by Mi- 
crosoft products, etc. A critical difference in the mail alge- 
bra is that messages should be identified not by pathname 
but by contents. For messages that conform to RFC 822, 
the value of the Message-Id field can stand in for the con- 
tents. Our synchronization algorithm and proof techniques 
may nevertheless apply to this new algebra. 
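
As a small illustration of identifying messages by Message-Id rather than by pathname, the sketch below indexes two replicas of a folder using Python's standard `email` module. The folder representation (a dictionary from message number to raw message text) and the sample messages are assumptions made for the example; this is not a mail synchronizer.

```python
from email import message_from_string
from typing import Dict

Folder = Dict[str, str]  # assumption: message number -> raw message text


def index_by_message_id(folder: Folder) -> Dict[str, str]:
    """Map each Message-Id to the message number it has in this replica."""
    index: Dict[str, str] = {}
    for number, raw in folder.items():
        message_id = message_from_string(raw)["Message-Id"]
        if message_id is not None:  # messages without the field are skipped
            index[message_id.strip()] = number
    return index


replica1 = {
    "1": "Message-Id: <a@example.org>\n\nhello\n",
    "2": "Message-Id: <b@example.org>\n\nbye\n",
}
replica2 = {"7": "Message-Id: <b@example.org>\n\nbye\n"}

missing_at_2 = set(index_by_message_id(replica1)) - set(index_by_message_id(replica2))
print(missing_at_2)
```

Message 2 at the first replica and message 7 at the second are recognized as the same message even though their numbers differ, so only the message with identifier <a@example.org> would need to be propagated.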

Existing synchronizers are either ill-specified (many of 
the commercial tools) or inflexible (Balasubramaniam and 
Pierce 1998). An algebraic approach seems to offer a nat- 
ural and understandable path to specification and imple- 
mentation of a file synchronizer, but the real potential ad- 
vantages lie in two areas. 

• Whereas an approach based on states leads to a single 
conflict-resolution policy, our algebraic approach sup- 
ports several alternatives, including alternatives that 
support disconnected repairs. 

• An algebraic approach may be useful for other synchro- 
nization problems, such as synchronizing mail folders, 
PalmOS databases, or other kinds of files with internal 
structure. 

In the long run, it may even be possible to build a syn- 
chronizer that is parameterized by an algebra, an update 
detector, and a conflict resolver. Perhaps one could extend 
such a synchronizer without having to prove the whole thing 
correct; instead, one could limit one's effort to proving the 
soundness of the algebraic laws and of the update detector. 